Effective Instruction Scheduling With Limited Registers
Abstract
Effective global instruction scheduling techniques have become an important component in modern compilers for exposing more instruction-level parallelism (ILP) and exploiting the ever-increasing number of parallel function units. Effective register allocation has long been an essential component of a good compiler for reducing memory references. While instruction scheduling and register allocation are both essential compiler optimizations for fully exploiting the capability of modern high-performance microprocessors, there is a phase-ordering problem when we perform these two optimizations separately: instruction scheduling before register allocation may create insatiable demands for registers; register allocation before instruction scheduling may reduce the amount of parallelism that instruction scheduling can exploit. In this thesis, we propose to solve this phase-ordering problem by inserting a moderating optimization called code reorganization between prepass instruction scheduling and register allocation. Code reorganization adjusts the prepass scheduling results to make them demand fewer registers (i.e., exhibit lower register pressure) and guides register allocation to insert spill code that has less impact on schedule length. Our new approach avoids the complexity of simultaneous instruction scheduling and register allocation algorithms. In fact, it does not modify either the instruction scheduling or the register allocation algorithm. Therefore, instruction scheduling can focus on maximizing instruction-level parallelism, and register allocation can focus on minimizing the cost of spill code. We compare the performance of our approach with a particular successful register-pressure-sensitive scheduling algorithm, and show an average of 18% improvement in speedup for an 8-issue machine model with 32 integer registers.
Moreover, we show that our approach is effective and feasible for the aggressive all-path scheduling approach that is otherwise difficult to make register-pressure-sensitive.
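The phase-ordering tension described above can be made concrete with a small sketch. The code below is not the thesis's code reorganization algorithm; it is only an illustration, under simplifying assumptions, of how the order of the same instructions changes register pressure. The function name `max_register_pressure`, the `(dest, sources)` instruction encoding, and the two example schedules are all hypothetical. The model assumes a straight-line schedule, one register per value, and that an operand read for the last time frees its register before the result is written.

```python
def max_register_pressure(schedule, live_out=()):
    """Peak number of simultaneously live values in a linear schedule.

    schedule: list of (dest, sources) tuples in issue order (hypothetical encoding).
    live_out: values that must stay live past the end of the schedule.
    """
    # Record the last position at which each value is read.
    last_use = {}
    for i, (_, srcs) in enumerate(schedule):
        for s in srcs:
            last_use[s] = i
    for v in live_out:
        last_use[v] = len(schedule)

    live, peak = set(), 0
    for i, (dest, srcs) in enumerate(schedule):
        # Operands read for the last time here release their registers first,
        # so the result may reuse one of them.
        live -= {s for s in srcs if last_use[s] == i}
        live.add(dest)
        peak = max(peak, len(live))
        if dest not in last_use:
            # A value that is never read does not stay live.
            live.discard(dest)
    return peak


# Both schedules compute r = (a + b) + (c + d).
# "wide" hoists all loads to the top, which exposes more parallelism.
wide = [("a", ()), ("b", ()), ("c", ()), ("d", ()),
        ("t1", ("a", "b")), ("t2", ("c", "d")), ("r", ("t1", "t2"))]
# "narrow" delays the second pair of loads until the first sum is formed.
narrow = [("a", ()), ("b", ()), ("t1", ("a", "b")),
          ("c", ()), ("d", ()), ("t2", ("c", "d")), ("r", ("t1", "t2"))]

print(max_register_pressure(wide, live_out=("r",)))    # 4
print(max_register_pressure(narrow, live_out=("r",)))  # 3
```

In this toy model the load-hoisted order needs four registers and the delayed order needs three, which is the kind of trade-off between exposed parallelism and register demand that the abstract refers to.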
Similar resources
Registers On Demand, an integrated region scheduler and register allocator
Two of the most important phases of code generation for instruction level parallel processors are register allocation and instruction scheduling. Applying these two phases separately has major drawbacks, such as the introduction of false dependences, or a higher register pressure and thus more spill code. In this paper we present a new method which integrates register allocation and region schedulin...
Minimum Register Instruction Scheduling: A New Approach for Dynamic Instruction Issue Processors
Modern superscalar architectures with dynamic scheduling and register renaming capabilities have introduced subtle but important changes into the tradeoffs between compile-time register allocation and instruction scheduling. In particular, it is perhaps not wise to increase the degree of parallelism of the static instruction schedule at the expense of excessive register pressure which may resul...
Removing Anti Dependences by Repairing
1 Introduction. Computer designers and computer architects have been striving to improve uniprocessor performance since the invention of computers. The next step in this quest for higher performance is the exploitation of significant amounts of instruction-level parallelism. Therefore, superscalar and VLIW (very large instruction word) machines have been designed, which can execute se...
Registers On Demand: Integrated register allocation and instruction scheduling
Two of the most important phases of code generation for instruction level parallel processors are register allocation and instruction scheduling. Applying one phase before the other has drawbacks in either case. In this paper we present a new method which integrates register allocation and scheduling, called Registers on Demand. We discuss register selection, spilling and the insertion of st...
Instructions Scheduling for Highly Super-scalar Processors
Super-scalar processors can execute multiple instructions out-of-order per cycle and speculatively execute instructions through branches. Such processors invalidate many of the assumptions of traditional instruction scheduling. This article analyzes the impact of super-scalar processor architecture upon instruction scheduling. The compile-time schedule is shown to significantly impact performanc...